Ranking WebPages Using Web Structure Mining Concepts

Author

  • Zakaria Suliman Zubi
Abstract

With the rapid growth of the Web, users easily get lost in its rich hyperlink structure. Providing users with relevant information that meets their needs is the primary goal of website owners, and web mining is one of the techniques that can help them achieve it. Web mining is divided into three categories: web content mining, web usage mining and web structure mining. Web structure mining plays an important role in this approach. Two page-ranking algorithms, PageRank and Hyperlink-Induced Topic Search (HITS), are commonly used in web structure mining. Both algorithms treat all links equally when distributing rank scores. This paper compares the two algorithms. Ranking web pages is an important task because it helps the user find highly ranked pages that are relevant to the query. Different metrics have been proposed to rank web pages according to their quality, and this paper also briefly discusses the two most prominent ones.

Key-Words: Web Mining, Web Content Mining, Web Usage Mining, Web Structure Mining, HITS, PageRank, Authority and Hubs.
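
To make the statement that links are treated equally more concrete, the following is a minimal sketch of the standard PageRank iteration, not the paper's own implementation. The function name `pagerank`, the damping factor of 0.85, the fixed iteration count, the handling of pages without out-links, and the four-page graph `toy_web` are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iterative PageRank on a dict mapping each page to the pages it links to.

    Assumed conventions (not taken from the paper): damping = 0.85, a fixed
    number of iterations, and dangling pages spreading their rank uniformly.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # All out-links are treated equally: each gets the same share.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: distribute its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: page D is linked to by nobody.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph, page C collects links from three pages and ends up ranked first, while D, which has no in-links, keeps only the random-jump share.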

Similar resources

Link Analysis: Hubs and Authorities on the World Wide Web

Ranking the tens of thousands of retrieved webpages for a user query on a Web search engine such that the most informative webpages are on the top is a key information retrieval technology. A popular ranking algorithm is the HITS algorithm of Kleinberg. It explores the reinforcing interplay between authority and hub webpages on a particular topic by taking into account the structure of the web ...
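
The reinforcing interplay between hubs and authorities can be sketched as below. This is a simplified illustration of the basic HITS update, assuming it runs directly on a small hypothetical link graph rather than on a query-specific subgraph; the function name `hits`, the iteration count, and the graph `toy_links` are assumptions, not taken from the cited work.

```python
import math


def hits(links, iterations=50):
    """Basic HITS update on a dict mapping each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking to the page.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # Hub score: sum of the authority scores of the pages it links to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalise both vectors to unit Euclidean length so scores stay bounded.
        a_norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth


# Hypothetical graph: three hub-like pages pointing at two authority-like pages.
toy_links = {"H1": ["A1", "A2"], "H2": ["A1", "A2"], "H3": ["A1"]}
hub_scores, auth_scores = hits(toy_links)
print("best authority:", max(auth_scores, key=auth_scores.get))
print("hub scores:", {p: round(s, 3) for p, s in hub_scores.items()})
```

Here A1 is pointed to by all three hubs and receives the highest authority score, and H1 and H2 score higher as hubs than H3 because they point to both authorities.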

Optimal ranking in networks with community structure

The World-Wide Web (WWW) with its enormous size (~ 10 webpages) presents a challenge for efficient information retrieval and ranking. By effectively utilizing the topological information to rank the webpages, Google became the most popular tool on the web. One important feature of the WWW is that it exhibits a strong community structure in which groups of webpages (e.g. those devoted to a commo...

A Synonym Based Approach of Data Mining in Search Engine Optimization

In today’s era, the rapid growth of information on the web makes users turn to search engines as a replacement for traditional media. Sorting particular information through billions of webpages and displaying the relevant data is a tough task for the search engine. A remedy for this is SEO (Search Engine Optimization), i.e. having a website optimized in such a way that it w...

Ranking National Football League teams using Google's PageRank

The search engine Google uses the PageRanks of webpages to determine the order in which they are displayed as the result of a web search. In this work we expand Google’s idea of webpage ranking to ranking National Football League teams. We think of the teams as webpages and use the statistics of a football season to create the connections (links) between the teams. The objective is to be able t...

MapReduce Based Information Retrieval Algorithms for Efficient Ranking of Webpages

In this paper, the authors discuss the MapReduce implementation of crawler, indexer and ranking algorithms in search engines. The proposed algorithms are used in search engines to retrieve results from the World Wide Web. A crawler and an indexer in a MapReduce environment are used to improve the speed of crawling and indexing. The proposed ranking algorithm is an iterative method that makes us...

Publication date: 2013